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Optimal strategies of radial velocity observations in planet 
search surveys 
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ABSTRACT 

Applications of the theory of optimal design of experiments to radial velocity planet 
search surveys are considered. Different optimality criteria are discussed, basing on the 
Fisher, Shannon, and Kullback-Leibler informations. Algorithms of optimal schedul- 
ing of RV observations for two important practical problems are considered. The first 
problem is finding the time for future observations to yield the maximum improve- 
ment of the precision of exoplanetary orbital parameters and masses. The second 
problem is finding the most favourable time for distinguishing alternative orbital fits 
(the scheduling of discriminating observations). 

These methods of optimal planning are demonstrated to be potentially efficient 
for multi-planet extrasolar systems, in particular for resonant ones. In these cases, the 
optimal dates of observations are often concentrated in quite narrow time segments. 
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1 INTRODUCTION 

Since the discovery of the first ex trasolar planet orbitin g 
the main-sequence star 51 Pegasi (|Mavor fc Quelozlll995h . 
more than 300 planets orbiting other stars were found, and 
about 30 extrasolar systems containing at least two plan- 
ets are known (see The Extrasolar Planets Encyclopaedia 
by J. Schneider, www.exoplanet.eu). The majority of these 
planets was discovered using the radial velocity (hereafter 
RV) method. Many multi-planet systems show interesting 
dynamical behaviour like mean-motion resonances and apsi- 
dal corotation locks. The sharp understanding of dynamical 
regimes in these systems is of a great importance. For ex- 
ample, this information may provide significant constrain ts 
on the planetary migration processes (|Beauge et al.l l2006). 

Unfortunately, orbital parameters and masses of many 
of the extrasolar planets are still poorly known due to a lack 
and/or a non-uniform sampling of RV observations. Esti- 
mations of parameters may be too uncertain or alternative 
models of the systems may be allowed. Unfortunately, such 
uncertainties are especially inherent in mult i- planetary con- 
figurations. It seems that strictly justified strategies of allo- 
cating observations still are not widely used in the current 
practice of RV planet searches. The aim of the present work 
is to describe several mathematically strict ways of optimal 
planning of RV observations, based o n the general theory 



with particular attention to refining orbits in multi-planet 
configurations. In Section [5] several criteria of observing 
schedule optimality are discussed. In Section^ algorithms of 
optimal planning of RV observations are described in details. 
In Sectionf3] the region of their applicability is discussed. In 
Section [S] the efficiency of these algorithms is demonstrated 
using data for real planetary systems. 



2 OVERVIEW OF OPTIMALITY CRETERIA 

Let us denote N RV measurements of a given star by Wmcas.n, 
the variances of the corresponding RV errors by <Tmeas,n- 
We assume that the RV errors are statistically independent 
and follow Gaussian distributions with zero means and with 
specified variances. The timings of the observations are de- 
noted by t„. We also assume that the RV model (corre- 
sponding to a certain orbital configuration of the planetary 
system orbiting the star) is given by fi(t, 9), where 9 is the 
vector of d parameters 9 of the RV curve for a given star 
(the majority of these parameters characterises the plane- 
tary system orbiting this star). Based on the RV time se- 



ries (t 



, er me as,n) and on the model fj,(t,9), we can 



planning of KV observations, based o n tne general tneory 
of optimal design of experiments (e.g ., Ermakov et al.lll983l ; 
lErmakov fc Zhiglvavkiilll987l ; lBardlll974 chapter 10), and 
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construct an estimation 9* of the vector 9. Since the obser- 
vations suffer from random errors, the estimations 9* con- 
tains random errors as well. New observations would de- 
crease them. Our task is to find some 'optimal' schedule for 
these new observations, which would lead to a maximum 
improvement in the precision of the estimations. But firstly 
we should clarify the sense in which we consider a given 
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strategy of sheduling of the observations as 'optimal'. The 
treatment of the optimality is not unique. 

Let us write down the elements of the d x d Fisher in- 
formation matrix associated with the parameters 0: 

n — l °mcas,n uu i- uu 3 t — t n 

The Fisher information matrix characterises the degree of 
statistical determinability of the parameters 9. The larger 
is Q, the larger are the derivatives dj-i/dOi reflecting the 
sensitivity of the RV model to small shifts of the parame- 
ters, and the smaller is the domain of uncertainty asso ciated 
with 0. For example, it is well-known (|Lehmanlll983l . § 6.4) 
that if the estimation 6* is a (non-linear) least-squares esti- 
mation and the number of observations N grows infinitely, 
then 9* is asymptotically unbiased (i.e., its mathematical 
expectation KG* tends to the true values of 6), and its 
probability distribution tends to a multivariate Gaussian 
one with the variance-covariance matrix Var#* tending to 
the matrix C = Q 1 (that is, in the coordinate notation, 
(Var0*)ij = E(0*0*) - E0*E0* ~ Cy). When N is smaller, 
the distribution of the least-squares estimations may be sig- 
nificantly non-Gaussian. For example, it may be multimodal 
so that the data may allow several alternative orbital fits. 
In this case, the variance-covariance matrix of 9* does not 
characterise the confidence domain of 9 in advance, but the 
Fisher information matrix Q(0) still can be used to charac- 
terise the degree of 'peakyness' of the distribution of 9* in 
the given point 9. 

Therefore, the information matrix Q constitutes the ba- 
sis for a series of optimality criteria. For arbitrary timings 
Ti,T2,...,r m for m new observations, we can predict the 
new information matrix Q using the formula similar to ([1]), 
but containing m extra terms. To obtain an optimal sched- 
ule, we should maximize the elements of this matrix over n. 
Different criteria from this group pay different attention to 
dif ferent elements of th e informatio n matrix (see, e.g., §2.1 
by Ermak ov et al.|[l983l and §3.1 bv lErmakov fc ZhiglvavkiH 
1 19871 ). Let us select a few popular criteria of optimality that 
may suit our needs: 

(i) D- optimality. This criterion considers the determinant 
det Q (the D-information) as an objective function to be 
maximized. In the large-sample asymptotics (N — > oo), the 
quantity \/det Q = 1 / V det C is inversely proportional to 
the volume of the uncertainty ellipsoid associated with the 
estimations 9* . Therefore, the D-optimal schedule lead to a 
maximum decrease of this volume (provided the scheduled 
observations are performed). 

(ii) Generalised D '-optimality (D s -optimality). This crite- 
rion is used to refine the precision of certain function of 
the parameters 6. Introducing the vector of s quantities 
77 = r)(6) to be refined, we can calculate the matrix 

*=§2 C (§S) T . < 2 > 

which approximates asymptotically (for N — > 00) the 
variance-covariance matrix of rj(0*). We can also predict 
the new matrix K for the time after making the new ob- 
servations, by means of substituting in the eq. ([2]), instead 
of the matrix C, the matrix C = Q . The generalised D- 
optimality criterion seeks to minimize det K. 



(iii) L-optimality (linear optimality) . This criterion seeks 
to minimize certain linear combination of the elements of C. 
Mathematically, we need to minimize the quantity Tr(LC) = 
Yli j=i LijCij, where the positive definite or semi-definite 
matrix L is fixed a priori. This criterion may be used, for 
instance, to minimize a weighted average of the variances 
of the estimations. The criterion of L-optimality minimizes 
the mathematical expectation of the quadratic loss function 
J2i •=! LijA9iA9j, associated with the random errors A9 of 
the estimations 9* . 

There is some obstacle in the use of the criteria of op- 
timality discussed above. It comes from the fact that all 
of these criteria depend on the values of the parameters 9, 
which are unknown. To overcome t his obstacle, one of the 
follow ing approaches can be used (|Ermakov fc Zhiglvavkiil 
1 19871 . §§5.2,5.3): 

(i) Sequential approach. In this approach, we simply sub- 
stitute the current estimations 9* . Therefore, after obtain- 
ing more and more measurements, we refine the estimations 
together with the optimality criterion. 

(ii) Bayesian approach. In this approach, we average the 
adopted objective function using some weight function of 9. 
This weight function may represent the current prior (poste- 
rior with respect to obserevations which are already made) 
distribution of 9. The averaged objective function can be 
further maximized (minimized) to obtain the corresponding 
optimal schedule. 

(iii) Minimax approach. In this approach, we maximize 
(minimize) the minimum (maximium) value of the objective 
function within the domain to which 9 is supposed to belong. 

From the computational view point, the sequential approach 
is more efficient, since it does not require extra integrations 
or maximizations. However, it requires a bigger amount of 
'priming' observations, which are needed to obtain a definite 
and, desirably, nearly Gaussian, starting estimation 9* . The 
Bayesian approach is formally free from this limitation, but 
it has two well-known disadvantages: the complexity of the 
appearing integrals over 9 and the ambiguity concerning the 
choice of prior distributions. The main disadvantage of the 
minimax approach is the presense of extra multi-dimensional 
optimisation, which is easy to perform in simplest cases only. 

There is another group of optimality criteria, which deal 
with the Shannon information (negative Shannon entropy) 
/ = — J p(9) In p(9)d9, associated with certain probabil- 
ity density p{9). The Bayesian approach of maximizing the 
Shannon information associate d with the p osterior distri- 
bution of 9 was considered by iFordl {2008). This method 
also suffers from the two mentioned disadvantages of the 
Ba yesian algori thms. Although for some simplified RV mod- 
els IFordl (|2008h proposed a way of decreasing the amount 
of calculations required in extra integrations, the abilities 
of the Bayesian criteria still remain limited, especially for 
multi-planet systems with large numbers of free parameters. 

Since the uncertainty of the estimations 9* decreases for 
larger time series, the sequential and Bayesian approaches 
become asymptotically equivalent when N — > 00. The same 
property of the asymptotic equivalence holds true for the 
D-optimality criterion and the Shannon information crite- 
rion. The reason for this equivalence comes from the fact 
that when N — > 00, the distribution of 9 tends to the mul- 
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tivaraiate Gaussian one with Var#* ~ C. The resulting 
Shannon information can be uniquely expressed via the D- 
information, according to the relation / = In \/det Q + const . 
Therefore, for relatively large N, the best choice is to use 
the sequential optimality criterion for refining the orbital 
configurations of planetary systems. 

In RV planet searches, we often deal with two or more 
roughly equally likely orbital solutions for the planetary 
system. The optimality criteria discussed above are insen- 
sitive to this multiplicity. They favour to futher increas- 
ing of the 'peakyness' of the modes of the \ 2 function 
or of the likelihood function (or of the posterior distribu- 
tion of 0), but they may be less useful for discriminating 
one of the peaks. In the case of multiple alternative or- 
bital fits, we should use anouther optimality criterion, which 
is based on the Kullback-Leibler discriminat ing informa- 
tion (see_55. 5 bv lErmakov fc Zhiglvavkiilll987l and §10.5 by 
lBardlll974h . Before we describe this criterion, let us intro- 
duce some extra definitions. Given the current estimations 
= 0* , we can make a prediction of the RV at any time t 
as v = fi(t, 0*). Since 0* incorporate random errors, the RV 
prediction should also contain a random component leading 
to an uncertainty of the predicted value of v. For each or- 
bital fit we may construct its own RV prediction. After that, 
we can define the Kullback-Leibler informations 
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Here, the probability densities pi,2(i>) describe the distri- 
bution of the RV prediction for the one of two orbital fits. 
It is not hard to see that the quantities (|3} represent the 
mathematical expectations of the likelihood ratio statistic 
considering the first or the second model as true. To obtain 
optimal dates for discriminating observations, we need to 
maximize /all ; or some their combination. 

In this paper, we aim to describe rules of optimal plan- 
ning of RV measurements for multi-planet extrasolar sys- 
tems. For such systems, the initial amounts of observations, 
needed in the sequential approach, are usually available. 
Thus we adopt below the sequential approach to consruct 
detailed algorithms for D- and L-optimal scheduling and 
for optimal scheduling of discriminating observations. 



3 ALGORITHMS OF OPTIMAL SCHEDULING 
3.1 D-optimal scheduling 

If we obtain an extra, (N + l) th , RV observation at time t, 
the precision of the estimations 0* increases. The new in- 
formation matrix Q is defined by the formula similar to ([1]), 
but containing an extra, (N + 1) , term in the summation. 
The D-optimality criterion seeks to find r that provides the 
maximum value of det Q. Let us write down the asymptotic 
(N — > oo) variance of the RV prediction v, using the follow- 
ing analogue of the formula © : 



' prcd 
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»J=1 



(4) 



According to (Bard 1974, § 10.3), the maximization of det Q 
is equivalent to the maximization of the RV prediction vari- 
ance o"p re d(i"), which should be calculated in the approx- 



imation Q. That is, the optimal time for the future RV 
measurement corresponds to the most uncertain RV predic- 
tion. This rule is quite clear: to improve our knowledge, we 
need to make observations when our predicting abilities are 
mostly limited. On contrary, we profit little from observa- 
tions made when the observed quantity is well predictable. 

When merging data from several observatories, it may 
be more useful to maximize the full variance of the deviation 
of the future measurement from the RV prediction: 



ct 2 (t) = (Jp rcd (r) + a 
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(•>) 



where ff^ cas is the variance of the future RV measurement 
(expected for a given observatory). At last, we define the 
non-dimensional function J 2 (r) — detQ/ det Q, which can 
be transformed to a more simple form 



J 2 (r) 



detQ _ <t 2 (t) 
det Q o-2 lcas 



= 1 + 



7 pred( r ) 



> 1. 



(6) 



This relation is derived in ijBardl Il974l . §10.3) and in the 
Appendix IA1 of the present paper. The identity (|6]) means 
that the value of J(r) tells us how much the volume of the 
uncertainty ellipsoid associated with 0* would decrease after 
making the extra observation at time r. 

Often, it is not necessary to refine the whole set of pa- 
rameters 0. Instead, we may be interested in improving the 
precision of only some of these parameters or in refining a 
certain function of 0. For instance, we have no direct need 
in refining the estimation of the velocity of the barycentre 
of the planetary system (the constant velocity term in the 
RV model). We may want to refine orbital elements of only 
certain planets in the system. We may want to refine only 
some combinations of the parameters of the system. 

Let us assume that we need to improve the precision 
of the vector r) = f](0). Linear (asymptotic N — + oo) ap- 
proximation to the variance-covariance matrix of r] is given 
by ([2]) The matrix K can be defined using the same formula 
but with C changed by C. Now we need to minimize det K 
instead of det C. In the case when the matrices K and K are 
not degenerated, we can extend the definition of J according 
to J 2 (r) = det K/ det K. As it is shown in Appendix IA1 the 
function J(r) can be again rewritten in a more simple form: 



j2 ( , cr (r) 



(7) 



where cr^(r) is the conditional variance of the difference (RV 
prediction — actual future RV measurement), calculated un- 
der condition of fixed 7). This conditional variance can be 
calculated using the formulae @ and ([SJ, but substitut- 
ing, instead of the matrix C, the corresponding conditional 



variance-covariance matrix C v of 0* 



A T K X A 



with 



00 



(8) 



Here, the matrix A represents the asymptotic approximation 
to the cross-covariance matrix of r\ and 8. Note that when 
r) = 0, we have A = K = C, C,, = 0, and a v — <7 mC as, as we 
could expect. If the parameters in rj represent a subset of the 
parameters in then the matrices K and A represent certain 
submatrices of the matrix C and the calculations are much 
simplified. In this case, all of the elements in the matrix C,, 
are zero, except for those corresponding to the variances and 
mutual correlations of the parameters rj. 
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Some difficulties arise when the matrix K is degener- 
ated. This may take place when some of the variables in 
r) are dependent and, hence, the matrix drj/dG is not of 
full rank. Such a case quite can be met in practice and we 
need to process it correctly. In this case det K = det K = 
and r < dimrj axes of the uncertainty ellipsoid of 77 van- 
ish. However, the volume of this ellipsoid in the subspace 
of the resting (dim rj — r) axes does not vanish yet. This 
volume is proportional to the product of all non-zero eigen- 
values of K (recall that the product of all eigenvalues of a 
matrix is equal to its determinant). Denoting this product 
as VK, we can define J 2 (r) = VK/VK. Note that this re- 
vised definition incorporates the previous one as a special 
case. Again, the function J(r) can be transformed to the 
more simple form @. However, now we cannot calculate 
the inverse matrix K -1 in the eq. JH). Instead, we should 
substitute the pseudoinverse matrix K + . This pseudoinverse 
matrix can be calculated via the eige ndecomposition of K 
(see, e.g., Appendix A in l|Bard|[l974l ) for a brief summary 
of this procedure and further references) . 

In the general case, we need to find the time range when 
the values of the function (|7J are large. The physical sense of 
this rule is intuitively clear again: to improve the precision 
of a given set of parameters, we ought to make observations 
when the uncertainty of our prediction is large, in compari- 
son with the same uncertainty calculated under assumption 
that the parameters to be refined are known exactly. 

It is worth noting that the variances a-p rcd (r), 
o"^ pred (r), and, hence, the values of J(r) are invari- 
able with respect to arbitrary smooth non-degenerated re- 
parametrization. That is, if we define the transformations 
of parameters 61 = 61(6) and T71 = r]i(ri) having non-zero 
Jacobians (i.e., det(d0i/dd) / and det(<9?7i / drj) / 0) in 
the point 9* , the values of J(r) calculated by 6\ , rji would 
exactly coincide with those calculated by 6, 77. 



with the matrix L changed by M. Note that now possible 
degeneracy of the matrix K does not produce any obstacles. 

The L-optimal criterion is invariable with respect to 
non-degenerated changes of variables only if the matrix L is 
transformed in accordance: Li = ( J^y) T L(J^-)- 

3.3 Scheduling discriminating observations 

Let us now assume that we have two alternative RV models 
Hi(t,0i) and ^2(^,62) describing two different orbital con- 
figurations of the planetary system. The sets of parameters, 
6 1 and 62, do not necessarily coincide and the dimensions 
of the models, dim#i and dim 82, may be different as well. 
Using the approach from the previous subsection, we can 
calculate the predictions vi (r) and V2 (r) along with the full 
variances <j\(t) and o"|(r), according to ([5]). Then we can 
write down the ex pected infor mation for discriminating be- 
tween the models (|Bardlll974l . §10.5), Ji2(r), as 

The function represents the sum of the Kullback-Leibler 
informations i]i a and J 2 |i 5 calculated from Q under as- 
sumption that the distributions of Wj are close to Gaussian. 
The largest values of Ji%{t) correspond to the most promis- 
ing time for future observations to rule out one of the alter- 
native models. This rule means that we need to make RV 
observations when the two models imply largely different 
predictions of the radial velocity. Simultaneously, the uncer- 
tainties of these predictions should not be too large, in order 
to avoid statistically insignificant differences. The combina- 
tion takes into account both these requirements. 

The property of the invariance of J(r) with respect to 
a re-parametrization is valid for the function Jiz{t) as well. 



3.2 L-optimal scheduling 

We may also be interested in constructing an L-optimal 
schedule. Now the objective function to be minimized by r 
is Tr(LC). We can define the non-dimensional gain function 
1(t) = Tr(LC) / Tr(LC) to be maximized. Using identity (|A"2)) 
from Appendix [S] we can write down the expression 



1 



c(r) T Lc(r) 
a 2 (r)Tr(LC) 



(9) 



where the vector c(r) has elements a = YLj=i dj'§e L anc ^ 
represents the asymptotic covariation of 6* and the RV pre- 
diction at t = t. 

When we want the refine the vector of parameters 
we need to use some generalisation of the function ©. In 
this case, we need to minimize the function Tr(LK) by r. 
We redefine l(r) = Tr(LK)/ Tr(LK) and use the first of the 
formulae (I A3I) to write down 



(10) 



J_ _ o(r) T La(r) _ _ c(r) T Mc(r) 
1(t)~ ~~ a 2 (r)Tr(LK) ~ _ <j 2 (t) Tr(MC) ' 

where the vector a(r) having elements = Ylj=i ^-t/Je^ 
represents the asymptotic covariation of rf(0*) and the RV 
prediction at t = t. The matrix M is equal to (|^-) T L(|2.). 
This generalised L-optimality rule represents the usual one 



3.4 Scheduling multiple observations 

We may need to plan several observations simultaneously. 
This problem may arise, for instance, when we wish to plan 
(at least preliminarily) a whole set of observation allocated 
for a coming observing season for a given star. 

Let us denote the timings of m future observations by 
n, T2, . . . , T m - Using the same approach as in the previous 
subsections, we can make m RV predictions forming the vec- 
tor v (dim v — m) and calculate the full variance-covariance 
matrix V prc d of this vector. Thus the m x m matrix V prc d 
should contain the variances and cross covariations of the RV 
predictions. Also, we can calculate the conditional m x m 
variance-covariance matrix V^.prcd of the vector v, taken un- 
der condition of fixed rj. Corresponding variance-covariance 
matrices of full deviations of the future RV measurements 
from their predictions can be calculated as 



V 



sl+V 



prcd 7 



5 I ~\~ ,prcd 



(12) 



with I being the identity matrix. Now we can write down 
the extension of the function J(t) to multiple observations: 



J (n,T2, 



VK 



detV 



T>K det V„ 



> 1. 



(13) 



The generalisation of the function l(r) can be calculated 
according to the equality 
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1 _ Tr(LK) _ Tr^W) 

l( Tl ,r 2 ,...,r m ) ~ Tr(LK) Tr(LK) ' V ' 

where the matrix W is calculated in the same way as the 
variance-covariance matrix V prc d of RV predictions but sub- 
stituting, instead of the matrix C, the matrix CMC (with M 
given in Section [3. 5) . The generalised discriminating infor- 
mation looks like 

Jia(n,7a,...,T m ) = -m+iTr(V 1 ; 1 V 2 +V 2 - 1 V 1 ) + 

+i(u 2 - «i) T (Vr 1 + V2 1 ) (v 2 - in), (15) 

where subscripts of V and v refer to the alternative models. 

The main difficulty in using time allocation rules based 
on the functions (|13H15p may be connected with too large 
number of timings r, to be found. Perhaps, there is no big 
obstacles to scan a two-dimensional grid of (n,T2) directly 
when m = 2. For m 3, such direct scanning requires too 
intensive computations and is not practical. We may use 
here the following algorithm. At first, the one-dimensional 
objective function $1(1-1) is constructed (it may be J(ri), 
£(ri), or Ji2(ti), depending on our aims). Using the direct 
one-dimensional search of the maximum, the date ri for one 
of the future RV measuments is obtained. Then the two- 
dimensional planning function, $2(1-1, t%), is plotted, but 
the date n is fixed at the value obtained in previous step. 
Therefore, again we can use a one-dimensional search of the 
maximum. As a result, we obtain the second date t%. The 
values of n and r 2 may be further adjusted using some non- 
linear maximization algorithm. Then we construct the func- 
tion 3>3(ti,T2,T3) with n and T2 fixed, obtain the optimal 
value of T3, adjust the whole array Ti,T2,T3, and so on until 
the full set of m optimal dates is found. We may break this 
sequence if an extra observation does not provide enough 
gain. The released time can be used to observe other stars. 



4 APPLICABILITY OF THE ALGORITHMS 

Of course, there is no statistical method that can be applied 
in every practical situation. The algorithms described above 
require the following conditions to be satisfied: 

(i) The estimations 6* are obtained using the least- 
squares approach or the approach used to obtain them is 
verified to be equivalent in the sense of planning the obser- 
vations. 

(ii) The number of existing observations should be suffi- 
cient, so that the estimations 0* are (approximately) unbi- 
ased and their joint distribution can be approximated by a 
multivariate Gaussian one. 

(iii) The equations of the RV model, jj,(t,0), and of the 
parameters to be refined, rj(8), can be linearised in the un- 
certainty ellipsoid surrounding the vector of the estimations. 

The first condition is usually satis fied in prac tice. Here, 
it is worth mentioning the paper (|Baluevl [2008b ) where an 
approach, other than the least-squares one, was proposed 
for determination of orbital parameters and masses of ex- 
oplanets. This maximum-likelihood approach incorporates 
a built-in estimation of the so-called RV jitter to be taken 
into account in the estimations of planetary parameters. A 
careful analysis shows that the modification of the rules of 



optimal planning for this case is easy and straightforward. 
This is provided by the fact that the cross elements in the 
Fisher information matrix, corresponding to the estimations 
of and of the RV jitter, vanish. In fact, the rules of plan- 
ning remain the same, but with the clause that the RV jitters 
should not enter in the vector 6. Instead, they should only 
be added to the values of ff^ cas . 

The second condition is satisfied for well-conditioned 
situations, when there is a single clear best-fitting orbital 
model of the system or the alternative models are well sepa- 
rated in the parametric space. The cases when RV data allow 
multiple locally best-fitting orbital solutions located suspi- 
ciously close to each other, correspond to the ill-conditioned 
situation. In such cases, no one of the alternative orbital 
solutions may represent a good approximation to the real 
configuration of the system. The local maxima of the likeli- 
hood function (or local minima of the weighted r.m.s.) rep- 
resent only 'ripples' produced by the lack of the data. The 
formal uncertainties of the estimations may be unrealistic 
and underestimated. New RV measurements may change 
the orbital solutions dramatically. In this situation, the es- 
timations of parameters of a planetary system are strongly 
biased and their distribution is far from the Gaussian one. 
Then the optimal planning algorithms should be used with 
care. We should track the sensitivity of their results to what 
orbital solution we adopt, to the functional model of the RV 
curve, to the set of free parameters. Seemi ngly, the extra- 
solar systems HD 82943 ijlVIavor et all 120031 ') and HD37124 
|Vogt et al.ll2005l ) may represent such cases. 

Different methods can be used to assess the reliability 
of orbital fits, needed to justify the use o f the scheduling 
rules described above. iBeauge et al.l (|2008h performed a se- 
ries of orbital fits of the system of HD82943 with truncated 
RV datasets, in order to prognose the sensitivity of cur- 
rent orbital configuration to future RV measureme nts. An- 
other, probably more rapid, approach was used by iBaluevi 
l|2008j) for the system of HD37124. It uses the condition 
number of the Fisher information matrix Q (or, speaking 
more precisely, of the scaled information matrix having ele- 
ments Qij J \J QaQjj) to assess the degree of 'ravineness' of 
the graph of the likelihood function. 

The third condition is tightly connected with the sec- 
ond one. For robust cases, both these conditions should hold 
true asymptotically, when N grows (because then all uncer- 
tainties tend to zero). For a finite N, the temporal region of 
their validity is limited, mainly due to the non-linear depen- 
dence on orbital periods of planets. It is admissible to use 
the linear methods of optimal scheduling during the time 
much less than the span of the RV time series (say, less than 
one third of the total time span). Attempts of prediction of 
optimal dates in a more distant future may be unsafe. 

Not every kind of non-linearity and non-gaussianity 
of the parameters can make the linear theory of plan- 
ning the observations unreliable. For instance, when the or- 
bital eccentricity of a planet is small, its distribution may 
be non-Gaussian, though the orbital configuration is well- 
determined. This is a typical situation for the systems con- 
taining a hot Jupiter planet: the true orbital eccentricity e of 
the hot Jupiter may be so small that even a very precise de- 
termination is unable to detect its deviation from zero. The 
formal (linear) estimation may look like e = 0.01 ± 0.01, 
implying the argument of the periastron ui is ill-determined. 



6 R. V. Baluev 



In this case, we can make the following change of variables: 
x = e cos uj,y = esinu. The dependence of the radial veloc- 
ity on the new pair of parameters x, y is almost linear and all 
necessary equations can be perfectly linearised with respect 
to these new variables. The joint distribution of estimations 
of x, y is much closer to a bivariate Gaussian (peaked near 
zero). In practical calculations, it is not necessary to perform 
such change explicitly, thanks to the invariance property of 
the functions J(r) and Ji2(i~). We can calculate J(j) and 
J12 (t) in the common way, using linear approximations with 
the initial (non-linear and non-Gaussian) set of parameters. 
As it was discussed above, the result is exactly the same as 
for the new (almost linear and Gaussian) set. 

We also note that rules described in Section[3]do not ac- 
count for possible sunlight or moonlight contamination: the 
formal maximum of J(r) or Ji2(i~) may lie beyond the ob- 
serving window of a given star. Therefore, we should search 
for maximum of J and J12 within the admissible dates only. 



5 APPLICATIONS 

5.1 Gliese 876: an orbital resonance 

The first (Jov i an) pl anet in this system was discovered by 
iDelfosse et all (|l998T ) using the s pectrograph E LODI E and 
shortly after this confirmed by iMarcv et alj |l998) , bas- 
ing on RV observat ions at Keck observatory. Some further, 
IMarcv et al.l (|200ll ) announced the second Jovian planet, 
which was trapped in the 2/1 mean-motion resonance. Now 
this system is believed to host three planets: the very 
low-mass (~ 7.5 Earth mass) planet d on a short-period 
(Pd « 2 days) orbit and two Jovian planets b and c, trapped 
in the 2/1 me an-motion resonanc e with P c ~ 30 days and 
Pb ~ 60 days (jRivera et alj 120051 ). It is important that the 
gravitational interactions between planets b and c were di- 
rectly observed in the RV curve: after ~ 8 years of observa- 
tions, the orbital periastra of these planets have completed 
a full revolution. It makes possible to determine the incli- 
nation of the system to the sky plane (it was estimated by 
about 50°), but simultaneously it introduces extra statistical 
uncertainties and correlations between different parameters. 
In addition, the orbital period of the planet c is close to 
lunar cycles. All these facts make precise determination of 
orbital parameters and masses in the system more difficult. 
Although currently the orbits in this system are constrained 
well, we may be interested in further refining the estimations 
and suppressing the correlations between different parame- 
ters of this system. 

We adopt the three-planet RV model taking into ac- 
count planetary perturbations. The orbits are assumed to 
lie in a common plane, and its inclination to the sky plane 
is treated as an extra free parameter. Therefore, the RV 
model have total of 17 free parameters 0: four osculating 
orbital elements and mass for the planets b, five similar pa- 
rameters for the planet c, five ones for the planet d, the 
orbital inclination and t he constant velocity term. The RV 
dataset was published in (R ivera et alj|200a ) and consists of 
N — 155 Keck RV measurements with typical internal RV 
uncertainties ~ 4 m/s. After obtaining the best-fitting es- 
timations 0* , we can apply the algorithm of the D-optimal 
scheduling discussed above. Here and in further examples, 



to obtain a more accurate estimation of the RV error vari- 
ance (Tmeas, expected from the observation being scheduled, 
we use the maximum-li kelihood appro ach of estimating the 
RV jitter, described in |Baluevll2008bT) . 

In Fig.[TJ the function J(r) for this system is plotted, for 
the case when 77 incorporates the 10 parameters of planets 
b and c and the common orbital inclination (case I) , and for 
the case when r) incorporates only the common orbital incli- 
nation (case II). We can see that these graphs are peaked. 
The sequence of peaks shows approximately monthly pe- 
riodicity, probably due to an interaction between the RV 
periodicities inspired by the resonant planet 'c' and 'b' and 
the aliasing inspired by lunar cycles. During the observing 
season following the last published observations, the heights 
of the largest peaks corresponded to about 30% decreasing 
of the total volume of the uncertainty ellipsoid of t] (in the 
case I) and about 10% increasing of the accuracy of the 
orbital inclination (in the case II). Such increase of preci- 
sion would be provided by a single RV measurement only. 
It is important that these peaks are not necessarily centered 
in the regions where the observations cannot be done due 
to the moonlight contamination. Typically, the peaks have 
semi-widths of about one week only. If made three days be- 
fore or after a maximum of J(t), the observations would 
not yield much gain. We can see that, during the last season 
when the star was observed, some peaks of J(r) were not 
covered by observation^]. Possibly, the observations might 
be distributed more efficiently, if this graph was constructed 
at that time. Also, a series of tempting peaks of J(r) can be 
seen in further seasons. However, the probability to observe 
in these narrow segments is quite small until the optimal 
planning algorithms are not used. 

We can see that, according to the graphs in Fig. [1] the 
most tempting time ranges for making the RV observations 
in the nearest future are located in the end of August, 2008 
and in the end of October, 2008. Unfortunately, any more 
precise statement of the optimal dates would be unreliable, 
because our analysis incorporates only published RV mea- 
surements of 3 — 4 years old, whereas their time span is only 
8 years. It may be dangerous to predict the optimal dates 
which are spaced from the last real measurement by about 
half of the actual time span. Moreover, it is possible that 
the actual RV measurements spanning the last three observ- 
ing seasons of GJ876, significantly shift the positions of the 
peaks of J(r). Nevertheless, the existence of favourable ob- 
servation dates for GJ876 in the nearest future is clear. The 
precise dates can be determined on the basis of the up-to- 
date RV dataset, including the unpublished measurements 
taken in the observing seasons of 2005-2007. 

5.2 HD208487: an alias ambiguity 

The first, roughly half Jupiter mass, p lanet in this system 
was discovered by iTinnev et alj (J2005) on the basis of RV 
observations made at the Anglo-Australian Observatory. Its 

1 As it can be seen in Figs.[T]and[2] the actual RV observations al- 
ways fall near minima on the graphs. This fact does not mean that 
all these observations were sampled so inefficiently. This means 
that extra observations would yield little gain in these positions, 
because they would only duplicate the existing observations. 
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Figure 1. The graphs of the function J(r) for the planetary system of GJ876. The broken segments of the graph correspond to the time 
when observations are impossible (when the star is absent on the night sky or is contaminated by the moonlight). The crosses on the 
time axis mark positions of several last RV measurements. Top panel: the vector rj incorporates 10 parameters of the planets b and c and 
the common orbital inclination to the sky plane. Bottom panel: incorporates only the orbital inclination. See text for more details. 
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Figure 2. The graph of the information for discrimination between two orbital models for the star HD208487 described in the text. The 
thin broken segments of the graph correspond to the time when observations are impossible. The crosses on the time axis mark positions 
of three last RV measurements. 



orbital period is close to 130 days. iGregorvl (|2005l . 120071 ) 
found an extra periodicity in the RV data for this star. The 
orbital period of the putative second planet was estimated 
by about 1000 days, and its mas s was estimat e d to b e close 
to the mass of the first planet. IWright et all (|2007f ) found 
two alte r native solutions, one is consistent with that from 
IGregorvl (|2007l ), and another one corresponds to the orbital 
period of the second planet of about 28.6 days. 

The periodogram of the latest RV data for HD208 487 
(N = 35 measurements published bv lButler et al.|[2006l and 
having the internal precision ~ 5 m/s) indeed shows two 
high peaks, near 28.6 days and near 900 days. The differ- 
ence between the respective frequencies corresponds to a 
period of about 29.5 days, clearly indicating the aliasing 
connected with the full moon / new moon cycle. The false 
alarm probability associated with the higher peak (which 
is nea r 28.6 days) c an be estimated using analytic bounds 
from (Balucv 2008a) by less than 1%. Thus, some extra pe- 
riodicity is probably present, but it is not fully clear which 
periodogram peak is real and which one is its alias. Natu- 
rally, we might ask, when the new RV observations should 
be made for them to rule out one of these alternatives. 

In Fig. [2] the corresponding graph of the function J\2(j) 
is plotted. It was constructed using two best-fitting double- 
Keplerian models of the RV curve, with the orbital period of 



the putative second planet near 900 days or near 28.6 days. 
In both models, the number of free parameters of the fit was 
equal to 11 (5 and 5 usual parameters of the Keplerian RV 
variation for the two planets plus constant velocity term). 
We can see that, in the last observating season in 2005, 
there were many good time segments when extra observa- 
tions would be highly desirable. In these peaks of Ji2(r), 
the RV predictions for different orbital models diverged by 
up to ~ 20 m/s. A single observation placed in one of these 
peaks could rule out one of the alternative models at the 
significance level of ~ 2 sigma or even more. However, three 
RV measurements actually made during this season did not 
cover the peaks of Ji2(i~). It is remarkable that the subse- 
quent observing seasons offer less opportunities of discrimi- 
nation between the models. 



6 CONCLUSIONS 

This paper demonstrates how the general tools of the theory 
of optimal design of experiments can be used for planning 
RV observations in planet search surveys. Two important 
practical problems were considered. In the first one, the ob- 
servations are required to produce the largest improvement 
in the precision of estimations of exoplanetary parameters. 
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In the second one, the observations are planned to yield 
maximum information for distinguishing between two alter- 
native orbital models of an exoplanetary system. 

These optimising tools are demonstrated using RV data 
for several real planetary systems. It is shown that these 
tools may significantly increase the efficiency of observa- 
tions in RV planet search surveys. They would be espe- 
cially useful for multi-planet extrasolar systems, in partic- 
ular for systems containing planet pairs in a mean-motion 
resonance, for many of which the orbits are still poorly deter- 
mined. They include, among others, the well-known systems 
of GJ876, HD82943, and possibly HD37124. The algorithms 
of optimal scheduling may also be useful in resolving ambi- 
guities concerning planetary orbital configurations, e.g. the 
alias ambiguity for the HD208487 system. 
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APPENDIX A: SOME MATRIX RELATIONS 

Let us consider the matrix Q given by {1} and the matrix 
Q = Q + gg T /c^casi where the elements of the vector g 
are given by gi = dii/dOj. Usi ng the identity det(l + AB) = 
det(l + BA) from (|Bardlll974l . Appendix A), which is valid 
for arbitrary matrices A and B with matching dimensions, 
we can obtain the relation 



^g=det(. ■ 
detQ 



= 1 + 



9 T Q 1 9 



(Al) 



which directly implies Also, we can use the identity 
fA + xy T )- 1 = A" 1 - A- 1 a :i/ T A~ 1 /(l + y T A _1 a;) from 
( Ermakov fc Zhiglvavkiilll987l . Appendix 1), which is valid 
for arbitrary matrix A and vectors x,y with matching di- 
mensions, to obtain 



Q + 



99 



Q 



(A2) 



where c = Q 1 g. The relation @ is a direct consequence of 
the latter equality. Using (|A2[I . we can also calculate 

K = K-^£, ^5 = 1-^^, (A3) 



a 2 ' detK a 2 ' 

where a = Ag with matrix A given in (JHJ . The first of equa- 
tions (|A3|) implies the relation (|10|l , and the second one im- 
plies the relation 

When K and K are degenerated due to dependency of 
the parameters r], we can easily check the equality 

^ (A4) 



VK 



= 1 - 



VK 

using (|A3[) and the eigenvalue decompositions of the ma- 
trices K and K, also bearing in mind that the orthogonal 
matrix, needed to transform K to a diagonal form, coincides 
with that for K. 

This paper has been typeset from a Tp^X/ F>TjtX file prepared 
by the author. 



